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Abstract 

In  [1]  we  introduced  a  class  of  multiscale  dynamic  models  described  in  terms  of  scale-recursive 
state  space  equations  on  a  dyadic  tree.  An  algorithm  analogous  to  the  Rauch-Tung-Striebel  algorithm- 
consisting  of  a  fine-to-coarse  Kalman-filter-like  sweep  followed  by  a  coarse-to-fine  smoothing  step- 
was  developed.  In  this  paper  we  present  a  detailed  system-theoretic  analysis  of  this  filter  and  of  the 
new  scale-recursive  Riccati  equation  associated  with  it.  While  this  analysis  is  similar  in  spirit  to  that 
for  standard  Kalman  filters,  the  structure  of  the  dyadic  tree  leads  to  several  significant  differences. 
In  particular,  the  structure  of  the  Kalman  filter  error  dynamics  leads  to  the  formulation  of  an  ML 
version  of  the  filtering  equation  and  to  a  corresponding  smoothing  algorithm  based  on  triangular- 
izing  the  Hamiltonian  for  the  smoothing  problem.  In  addition,  the  notion  of  stability  for  dynamics 
requires  some  care,  as  do  the  concepts  of  reachability  and  observability.  Using  these  system-theoretic 
constructs  we  are  then  able  to  analyze  the  stability  and  steady-state  behavior  of  the  fine-to-coarse 
Kalman  filter  and  its  Riccati  equation. 
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1  Introduction 

In  a  companion  paper  [1]  we  introduce  a  class  of  multiscale  state  space  models  evolv¬ 
ing  on  dyadic  trees  in  which  each  level  in  the  tree  corresponds  to  a  particular  level 
of  resolution  in  signal  representation.  Such  pyramidal  representations  for  signals  and 
images  have  been  and  continue  to  be  of  considerable  interest,  both  in  research  and 
in  application,  since  they  suggest  efficient  and  highly  parallelizable  computational 
structures  and  also  appear  to  be  natural  forms  of  representation  for  many  phenom¬ 
ena  including  those  with  fractal  or  self-similar  features.  The  framework  introduced 
in  [1]  had  as  its  motivation  the  development  of  a  rational  framework  for  statistical 
modeling  and  optimal  processing  based  on  such  pyramidal  representation,  and  the 
potential  of  this  framework  was  illustrated  in  [1]  both  for  problems  of  optimal  fu¬ 
sion  of  multiresolution  data  and  for  the  efficient  solution  of  computationally  intensive 
problems  of  signal  and  image  analysis  through  the  use  of  “fractal  regularization” 
techniques  based  on  our  models. 

One  of  the  other  contributions  of  this  work,  we  feel,  is  in  identifying  the  significant 
role  that  systems  and  control  researchers  can  have  in  this  area,  as  multiresolution  mod¬ 
eling  and  analysis  problems  have  a  strong  systems  flavor.  For  example,  the  optimal 
estimation  algorithm  [1]  can  be  viewed  as  a  direct  generalization  of  Kalman  filtering 
and  state  space  smoothing  algorithms,  introducing  a  new  class  of  scale-recursive  Ric- 
cati  equations.  This  suggests,  among  other  things,  the  development  of  a  theory  of 
multiresolution  modeling,  requiring  techniques  for  realization  and  identification,  and 
the  detailed  system- theoretic  analysis  of  the  filtering  algorithms  developed  in  [1].  The 
objective  of  this  paper  is  to  tackle  this  latter  problem,  while  an  initial  investigation 
of  multiscale  realization  theory  is  the  subject  of  [2]. 

In  the  next  section  we  briefly  review  the  multiscale  state  space  model  and  optimal 
estimation  algorithm  of  [1].  As  we  discuss,  the  objective  of  error  and  stability  analysis 
for  multiscale  filtering  leads  directly  to  a  variation  on  this  algorithm  which  we  develop 
in  Section  3.  This  “ML  algorithm”  also  has  a  direct  connection  with  the  solution 
of  the  estimation  problem  via  the  triangularization  of  the  smoothing  Hamiltonian, 
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which  we  describe  in  an  appendix.  In  Section  4  we  then  turn  to  the  system-theoretic 
analysis  of  our  models,  and  as  we  will  see,  the  notions  of  reachability,  observability, 
and  especially  stability,  have  significant  variations  as  compared  to  their  counterpart 
for  ordinary  state  space  models.  These  tools  are  then  used  in  Section  5  where  we 
analyze  the  properties  of  the  error  covariance  for  our  optimal  filter  and  the  stability 
and  asymptotic  behavior  of  the  filter  error  dynamics  and  our  new  Riccati  equation. 

2  State  Space  Models  and  Multiscale  Estimation 
on  Dyadic  Trees 

In  this  section  we  briefly  review  the  formulation  and  results  in  [1].  As  illustrated  in 
Figure  1,  the  basic  data  structure  for  multiresolution  modeling  is  the  dyadic  tree. 
Here  each  node  t  in  the  tree  T  corresponds  to  a  pair  of  integers  (m,n),  where  m 
denotes  the  scale  corresponding  to  node  t  and  n  its  translational  offset.  Thus,  if 
z(t)  denotes  a  signal  defined  on  T,  then  the  restriction  of  2  to  any  particular  level, 
i.e.  the  collection  of  values  of  z(t )  for  t  =  (m,  n)  with  m  fixed,  corresponds  to  the 
representation  of  a  signal  (viewed  as  a  function  of  n)  at  the  mth  scale.  As  illustrated 
in  the  figure,  it  is  useful  to  visualize  T  as  having  horizontal  levels  corresponding 
to  different  scales,  where  increasing  m  corresponds  to  moving  to  finer  resolutions. 
While  we  will  find  it  convenient  to  use  the  more  compact  notation  t  for  nodes  on  T, 
rather  than  the  scale-translation  pair  (m,n),  we  will  on  occassion  wish  to  refer  to  the 
scale  of  a  particular  node  t,  which  we  denote  by  m(t).  Also,  again  as  illustrated  in 
the  figure,  we  will  define  our  dynamic  operations  in  terms  of  basic  shift  operators, 
namely  the  unique  backward  shift  7  and  two  forward  shifts  a  and  /?.  In  particular  if 
t  =  (m,  n),  then  ta  =  (m  +  1, 2ra),  t(3  =  (m  +  1, 2n  +  1),  and  Py  =  (m  —  1,  [f]).  The 
basic  picture  one  should  have  is  that  finer  scales  introduce  additional  detail  into  the 
signal  representation,  while  coarser  scales  involve  successively  decimated  and  lower 
resolution  (e.g.  low-pass  filtered)  representation  (sec  [1]  for  further  discussion  and 
references). 

There  are  two  alternate  classes  of  scale- recursive  linear  dynamic  models  that  are 
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of  interest.  The  first  of  these  is  the  class  of  coarse-to-fine  state  space  models  on  T  : 


x(t)  =  A{t)x{i^)  +  B(t)w(t) 


(2.1) 


y(t)  =  C(t)x(t)  +  v(t)  (2.2) 

The  term  A{t)x(t7))  in  (2.1)  represents  a  coarse-to-fine  prediction  or  interpolation, 
B(t)w(t)  represents  the  higher  resolution  detail  added  in  going  from  one  scale  to  the 
next  finer  scale,  and  y(t)  is  the  measured  variable  (if  any)  at  the  particular  scale  m 
and  location  n  represented  by  t.  This  model  serves  as  the  basis  for  the  multiscale 
modeling  of  stochastic  processes  developed  in  [1] .  In  contrast  the  fine-to-coarse 
Kalman  filtering  step  of  our  estimation  algorithm  falls  into  the  class  of  fine-to-coarse 
recursive  models  of  the  form 

x(t)  —  Fi(ta)x(ta)  +  F2(tf3)x(t /?)  +  G(ta)w(ta)  +  G(t/3)w(t/3)  (2.3) 

Note  that  the  general  models  (2.1)-(2.3)  allow  full  t-dependence  of  all  the  system 
matrices,  and  several  of  the  applications  described  in  [1]  require  this  general  depen¬ 
dence.  An  important  special  case  is  that  in  which  the  system  parameters  are  constant 
at  each  scale  but  may  vary  from  scale  to  scale,  in  which  case  we  abuse  notation  by 
writing  A(t)  =  A(m(t)),  etc.  Such  a  model  is  useful  for  capturing  scale- dependent 
effects  and  fractal  behavior.  For  simplicity  we  focus  the  detailed  covariance  analysis 
and  stability  results  on  this  case,  while  our  investigation  of  steady-state  behavior,  of 
course,  looks  at  the  further  specialization  to  constant -parameter  models. 

In  [1]  we  analyze  the  second-order  statistics  of  (2.1)  when  w(t )  and  v(t )  are  inde¬ 
pendent,  zero-mean  white  noise  processes  with  covariances  /  and  R(t),  respectively. 
We  also  assume  that  w(t)  is  independent  of  the  “past”  of  x ,  i.e.  {a;(r)|m(r)  <  m(t)}. 
Also,  if  we  wish  to  consider  representations  of  signals  of  unbounded  extent,  we  must 
deal  with  the  full  infinite  tree  T,  i.e.  {(to,  n)|  —  oo  <  m,n  <  oo}.  This  will  be  of 
interest  when  we  consider  asymptotic  properties  such  as  stability  and  steady-state 
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behavior.  In  any  practical  application,  of  course,  we  must  deal  with  a  compact  inter¬ 
val  of  data.  In  this  case  the  index  set  of  interest  represents  a  finite  version  of  the  tree 
of  Figure  1,  consisting  of  M  +  1  levels  beginning  with  the  coarsest  scale  represented 
by  a  unique  root  node,  denoted  by  0,  and  M  subsequent  levels,  the  finest  of  which 
has  2m  nodes. 

The  covariance  Px(t)  =  E[x(t)xT (<)]  evolves  according  to  a  Lyapunov  equation  on 
the  tree: 

Px(t)  =  A(t)Px(tj)AT{t)  +  B{t)BT{t)  (2.4) 

In  the  scale- varying  model,  i.e.  the  case  in  which  the  model  parameters  vary  in  scale 
only,  if  at  some  scale  Px(t)  is  constant,  then  this  holds  at  each  scale,  so  that  by  an 
abuse  of  notation  Px(t)  =  Px(m(t)),  and  we  have  a  scale-to-scale  Lyapunov  equation: 

Px(m  +  1)  =  A(m)Px(m)AT(m)  +  B(m)BT(m)  (2.5) 

If  we  further  specialize  our  model  to  the  case  in  which  A  and  B  are  constant,  and 
if  A  is  stable,  then  (2.5)  admits  a  steady-state  solution,  to  which  Px(m )  converges, 
which  is  the  unique  solution  of  the  usual  algebraic  Lyapunov  equation: 

Px  =  APX  AT  +  BBt  (2.6) 

In  our  development  and  analysis  of  smoothing  algorithms,  we  encounter  the  need 
for  fine-to-coarse  prediction  and  recursion.  In  particular,  the  reversal  of  (2.1),  i.e.  a 
model  representing  x(Pf)  as  a  linear  function  of  x(t)  and  a  noise  that  is  uncorrelated 
with  x(i)  is  given  by 

x(tq)  =  F(t)x(t)  —  yi-1(t)i?(t)ui(t)  (2.7) 


with 


and  where 


F(t)  =  A~'W  -B(t)BT(t)P;'(t)\ 
=  P,(ti)AT(t)Px-'(t) 


(2.8) 


2  STATE  SPACE  MODELS 


5 


w(t)  =  tu(£)  —  _E[u?(tf)|x(f)]  (2.9) 

£Wf)t2>r(f)]  = 

=  Q(t )  (2-10) 

In  [1]  we  derive  a  generalization  of  the  Rauch-Tung-Striebel  smoothing  algorithm 
consisting  of  a  fine-to-coarse  Kalman  filtering  step  followed  by  coarse-to-fine  smooth¬ 
ing  step.  Specifically,  let  i(s|£)  denote  the  optimal  estimate  of  x(s)  based  on  data  at 
or  “below”  node  t  (i.e.  yir)  for  r  =  t  or  r  a  descendant  of  t),  and  let  x(s|£+)  denote 
the  optimal  estimate  of  x(s)  based  on  data  strictly  “below”  t  (i.e.  y{r)  for  r  a  strict 
descendent  of  t).  Let  P(s\t)  and  P(s\t+)  be  the  corresponding  error  covariances. 
Then  the  coarse-to-fine  Kalman  filter  consists  of  a  measurement  update  step 


x(t\t)  =  x(t\t+)  +  K(t)[y(t)  -  C(t)x(t\t+)}  (2.11) 

I<(t)  =  P{t\t+)CT{t)V-l{t)  (2.12) 

V{t)  =  C{t)P(t\t+)CT{t)  + R(t)  (2.13) 

P(t\t)  =  [I-  K{t)C{t)]P(t\t+ )  (2.14) 

a  coarse-to-fine  one-step  prediction  step: 

x{t\ta)  —  F(ta)x(ta\ta)  (2.15) 

x(t\t/3)  =  F(tf3)x(tf3\t(3)  (2.16) 

with  corresponding  error  covariances  given  by 

P(£|£a)  =  F{ta)P{ta\ta)FT{ta)  +  Q{ta)  (2.17) 

Q(ta)  =  A-1(ta)B{ta)Q(ta)BT(ta)A~T{ta )  (2.18) 

P(tm  =  F(t/3)P(t0\tl3)FT(0)  +  am  (2.19) 

Q(t/3)  =  A-‘it.0)B(i:r)QU;r)BTll3)A-THB)  (2.20) 


and  a  fusion  step  to  merge  the  estimates  (2.15)  and  (2.16),  to  form  x(t\t+): 

x(t\t+)  —  P(£|£+)[P-1(£|£a)x(£|£o:)  +  P_1(£|£/9)x(£|£/?)] 
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(2.21) 

P(t\t+)  =  [P-\t\t*)  +  P-\t\tf3)-P;\t)}-'  (2.22) 

This  filtering  algorithm  has  an  obvious  pyramidal  structure,  allowing  substantial 
parallelization.  Note  that  while  the  update  and  prediction  steps  are  analogous  to  the 
corresponding  steps  in  usual  Kalman  filtering  (although,  as  discussed  in  [1],  this  step 
must  proceed  from  fine-to-coarse  and  hence  must  use  the  backward  model  (2.7)  for 
the  prediction  step),  the  fusion  step  has  no  counterpart  in  the  standard  case,  and, 
as  we’ll  see  this  leads  to  some  interesting  differences  in  our  analysis  of  the  filtering 
algorithm. 

Finally,  let  xs(t)  denote  the  optimal  estimate  of  x(t )  based  on  all  available  data  on 
a  finite  subtree  with  root  node  0  and  M  scales  below  it.  Once  the  Kalman  filter  has 
reached  the  root  node  at  the  top  of  the  tree,  we  have  computed  a:3(0)  =  x(0|0),  which 
serves  as  the  initial  condition  for  the  coarse-to-fine  RT S  smoothing  sweep  which  also 
has  a  parallel,  pyramidal  structure  : 

x3(t)  =  x(t\t)  +  J(t)  [xs{tl)  ~  z(*7l*)]  (2.23) 

J(t)  =  P(t\t)FT(t)p-\tj\t)  (2.24) 

where  Ps(t),  the  smoothing  error  covariance,  satisfies 

Ps(t)  =  P(t\t)  +  J(t)[Ps(t 7)  -  P(ti\t)]JT(t)  (2.25) 

3  The  ML  Filter 

The  fine-to-coarse  filtering  equations  presented  in  the  preceding  section  have  several 
significant  differences  with  standard  Kalman  filtering  analysis  and  present  some  dif¬ 
ficulties  in  analysis  that  provide  motivation  for  a  slightly  different  algorithm.  Specif¬ 
ically  the  Riccati  equation  (2.12)-(2.14),  (2.17)-(2.20),  (2.22)  for  our  optimal  filter, 
differs  from  standard  Riccati  equations  in  two  respects:  1)  the  explicit  presence  of 


3  THE  ML  FILTER 


7 


the  prior  state  covariance  Px(t)  and  2)  the  fusion  of  two  sources  of  information  in 
(2.22).  The  latter  of  these  is  intrinsic  to  our  Riccati  equations  and,  as  we  will  see, 
has  important  consequences  in  the  stability  analysis  of  fine-to-coarse  filtering.  The 
presence  of  Px(t),  on  the  other  hand,  points  to  an  apparent  complication  in  analyzing 
our  filter  that  motivates  an  alternate  filtering  algorithm  in  which  Px  does  not  appear. 
Specifically,  in  standard  Kalman  filtering  analysis  the  Riccati  equation  for  the  error 
covariance  can  be  viewed  simply  as  the  covariance  of  the  error  equations,  which  can 
be  analyzed  directly  without  explicitly  examining  the  state  dynamics  since  the  error 
evolves  as  a  state  process  itself.  This  is  apparently  not  the  case  here  because  of  the 
explicit  presence  of  Px(t)  in  (2.22)  and  in  the  backward  model  parameters  (2.7)-(2.10) 
that  enter  into  the  fine-to-coarse  prediction  step  (2.17)-(2.20).  On  first  examination, 
this  might  not  appear  to  be  a  new  problem,  as  backward  models  for  standard  tem¬ 
poral  models  also  involve  the  state  covariance.  However,  the  present  situation  is  not 
as  simple.  First  of  all,  as  discussed  in  [1],  the  driving  noises  in  (2.7)  are  not  white 
(except  along  fine-to-coarse  paths).  Also,  and  more  importantly,  the  new  fusion  step 
adds  a  new  twist.  In  particular,  if  we  examine  the  backward  model  (2.7)-(2.10)  and 
the  Kalman  filter  (2.11),  (2.15),  (2.16),  (2.21),  we  find  that  the  upward  dynamics 
for  the  error  x(t )  —  x(t\t)  are  not  decoupled  from  x(t)  unless  i3~1(t)  =  0.  Thus 
we  apparently  have  a  significant  difference  in  analyzing  these  error  dynamics,  and, 
in  particular,  their  stability.  To  overcome  this,  we  consider  a  slight  variation  in  the 
filtering  and  RTS  algorithm. 

Specifically,  we  define  what  we  will  refer  to  as  the  ML  filter  by  setting  the  P~1(t) 
terms  in  (2.11)-(2.22)  to  zero.  The  resulting  filter  recursions  are  then  given  by 


Measurement  Update: 

XjML(t\t+)  +  KML{t)[y{t)  —  C(t)xML(t\t+)} 

(3.1) 

Kml(1)  — 

PML{t\t+)CT{t)V^L{t) 

(3.2) 

Vml(1)  = 

C(t)PML(t\t+)CT(t )  +  R(t) 

(3.3) 

PML{t\t )  = 

[/  —  KML(t)C(t)]PML(t\t+) 

(3.4) 
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One-step  Prediction: 

=  A~x{t)xML{t\t)  (3.5) 

Pml{Li  10  =  A-1(f)PMjL(f|i)A~r(t)  +  A~1(t)B(t)BT(t)A~T(t )  (3.6) 

Merge  Step: 

r.ML\l\t+)  =  Pm L(t\t+j{P^lL(t\ta!:£^L(t\to:j  ^  Pm,  {t\tfi)xML{l\t 8 ) j  (3.7) 

PH'Mt+)  =  P^Mtc)  +  P£,(t \tfi)  (3.8) 

The  key  difference  here  are  the  absence  of  a  Pj'1(f)  term  in  (3.8)  (compare  to 

(2.22) ),  and  the  changes  to  the  prediction  step  (3.5)-(3.6)  due  to  the  simpler  form  of 
the  backward  model  (2.7)-  (2.10)  when  P '^(t)  =  0. 

As  shown  in  Appendix  A,  XML{t\t )  has  the  interpretation  as  the  ML  estimate  of 
x(t),  viewed  as  an  unknown  vector,  based  on  the  measurements  Yt.  Thus  the  Bayesian 
estimate  of  the  preceding  section  and  its  covariance  can  be  computed  as  follows: 

x(t\t)  =  P(t\t)Pj^i(t\t)xML(t\t)  (3.9) 

p-'m  =  p&m  +  p;\t)  (3.io) 

Note  that  this  provides  us  with  an  alternative  RTS-like  algorithm:  we  apply  the 
fine-to-coarse  ML  filter  (3.1  )-(3.8)  from  the  finest  scale  M  up  to  the  top  of  the  tree, 
i.e.  through  the  computation  of  x^(0|0),  Pml(0|0).  We  then  incorporate  prior 
information  at  the  top  of  the  tree,  using  (3.9),  (3.10)  to  yield  x3(0)  =  rr(0|0)  and 
Ps(0)  =  P(0|0).  The  downward  smoothing  sweep  is  then  computed  by  adapting 

(2.23) -  (2.25)  (using  (3.9),  (3.10))  so  that  the  ML  estimator  computed  in  the  ML 
filtering  sweep  are  used  in  the  smoothing  step.  Specifically,  as  shown  in  Appendix  A 

Xa(t)  =  XML(t\t)  +  J(t)[xs(tl)  -  XML{tl\t )]  (3-11) 


Ps{t)  =  PML{t\t)  +  J(t)[Ps(t 7)  -  PML(tl\t)\JT(t) 


(3.12) 
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where 

j(t)  =  PMLmA-T(t)Pu\(n\t)  (3.13) 

Also  as  in  standard  Kalman  filtering,  the  ML  filtering  equations  (3.1)-  (3.8)  cannot 
be  directly  used  at  the  initial  levels  of  recursion —  i.e.  for  the  finest  level  M  and 
perhaps  several  levels  above  this— until  the  ML  covariance  is  well-defined.  Rather  the 
information  form  of  this  filter  must  be  used,  and  this  is  also  described  in  Appendix 
A.  Note  that  as  one  might  expect  and  as  will  be  used  in  Section  5,  observability  plays 
a  central  role  in  guaranteeing  that  the  error  covariance  does  become  well-defined. 
Also,  in  Appendix  B  we  present  an  alternate  viewpoint  for  the  derivation  of  RTS-like 
algorithms,  namely  through  analysis  of  the  Hamiltonian  equations  for  our  estimation 
problem.  The  Hamiltonian  and  the  two-point  boundary-value  problem  associated 
with  it  plays  a  central  role  in  the  theory  of  smoothing  for  standard  state  space  models. 
For  example,  as  discussed  in  [6],  [7],  triangularization  of  the  Hamiltonian  leads  to  two- 
filter  smoothing  algorithms,  while  triangularization  leads  to  the  RTS  algorithm.  In 
our  case,  the  structure  of  the  tree  adds  a  fundamental  asymmetry  to  the  Hamiltonian, 
which  precludes  diagonalization,  but  whose  triangularization  is  possible,  leading  to 
the  ML  form  of  the  RTS  algorithm  we  have  just  described.  This  is  developed,  for 
simplicity  in  the  constant-parameter  case  in  Appendix  B. 

Finally,  as  mentioned  at  the  beginning  of  this  subsection  one  of  the  motivations  for 
introducing  the  ML  filter  is  that  its  use  allows  us  to  obtain  a  dynamic  representation 
for  the  filtering  error  that  is  decoupled  from  the  state  dynamics  itself.  Specifically, 
from  (3.1)-(3.8)  we  can  derive  the  following  ML  filter  recursion 


XML(t\t) 

+  p^L{t\tf3)A-xmxmm  +  )  (3.14) 


Also,  from  (2.1) 


x(t)  =  A  1(ta)x(ta)  —  A  1  (ta)B(ta)w(ta) 


(3.15) 
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x(t)  =  A  —  A  1(t/3)B(t^)w(t/3)  (3.16) 

and  thus,  using  (3.8) 


x(t)  =  PML{t\t+)[PMlL{t\ta)A  l(toc)x(ta)  +  PM\{t\t(3)A  1(t0)x(t/3)\ 
-^ML(i|^+)[iM1L(t|io:)A_1(to)5(ta)u7(iQ;) 

+  Pj&(t\t0)A-\0)Bm*>m]  (3-17) 

and  thus  defining  XML(t\t)  =  x(t)  —  XMi,(t\t),  we  obtain 


xML(t\t) 

-  [I  -  KML{t)C(t)\PML(t\t+){PML{t \toi)A~l{ta)x(ta\tot)  +  P^(t|t/9)A_1(t/3)z(^|^)] 

-  PML(^|i+)[P^1L(t|to)/l_1(tQ;)5(to)u;(to)  +  P^1L(t|t^)i4"1(t/S)5(^)u;(^)] 

-  KML{t)v{t)  (3.18) 

Note  that  (3.14),  (3.17),  and  (3.18)  each  represents  a  fine-to-coarse  system  of  the 
form  of  (2.3),  and  in  particular,  (3.18)  represents  the  filtering  error  as  the  state  of 
such  a  system  driven  by  white  process  and  measurement  noise.  It  is  the  stability  of 
this  system-in  the  scale- varying  case-that  is  investigated  in  Section  5. 

4  System— Theoretic  Concepts  for  Fine— To— Coarse 

Dynamic  Models 

In  this  section  we  introduce  and  investigate  several  system-theoretic  concepts  for 
dynamic  systems  on  dyadic  trees.  The  structure  of  the  tree  leads  to  several  important 
differences  with  standard-state  space  system  theory,  and  furthermore  this  setting 
appears  to  be  a  natural  one  in  which  to  develop  a  theory  for  multiresolution  modeling 
and  realization.  Our  goals  here,  however,  are  far  more  modest.  In  particular  we  refer 
the  reader  to  [2]  for  a  first  step  in  developing  such  a  multiscale  realization  theory  and 
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focus  here  on  the  specific  constructs  and  results  needed  in  Section  5  for  the  asymptotic 
analysis  of  the  fine-to-coarse  filtering  algorithm  described  in  the  preceding  section. 

In  particular,  we  focus  here  on  the  fine-to-coarse  model  (2.2),  (2.3).  Moreover, 
the  analysis  of  Section  5  focuses  for  scale-varying  systems,  and  thus  we  focus  here  on 
the  analogous  specialization  of  (2. 2), (2. 3),  namely 

x(t)  —  F(m(t)  +  l)[a?(£a)  +  x(t/3)\  +  G(m(t)  +  l)[zo(ta)  +  w(t/3)]  (4.1) 

y(t)  =  C{m(t))x(t)  (4.2) 

where,  since  we  focus  in  this  section  on  deterministic  properties,  w(t)  in  (4.1)  should 
be  viewed  as  an  input,  and  we  have  eliminated  the  measurement  noise  from  the 
observation  (4.2).  Furthermore,  to  simplify  the  discussion  we  assume  the  F(m)  is 
invertible  for  all  m. 

4.1  Reachability  and  Observability 

The  first  property  we  wish  to  investigate  is  reachability  for  the  model  (4.1),  i.e. 
the  ability  to  drive  the  system  from  any  fine-scale  initial  condition  to  any  coarse- 
scale  target.  Note  that  the  number  of  descendent  nodes  below  any  node  t0  grows 
geometrically  with  scale —  i.e.,  there  are  2  descendants  one  scale  finer  than  t0,  4 
descendants  two  scales  finer,  etc.  Thus  there  are  2m  “initial  conditions”  affecting 
x(t0)  and  at  a  scale  M  levels  finer  than  a;(t0)-  Thus  let  us  define  the  following  vectors, 

Xmm  =  [xT(t0aM),xT(t0^aM~1),,..xT(t0/3M)]T  (4.3) 

Wm,i o  =  [  wT(t0a)  wT(t0f3 )  ...  wT(t0aM)...wT(t0f3M)  ]T  (4.4) 

The  vector  Xm,^  denotes  the  vector  of  2M  points  at  the  M th  level  down  that  influence 
the  value  of  x(t0).  The  vector  Wmjo  comprises  the  full  set  of  inputs  that  influences 
x(t0 )  starting  from  initial  condition  Xmjoi  i-e-  the  w(t)  in  the  entire  subtree  down  to 
M  levels  from  to. 
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Definition  4.1  The  system  (4-1)  is  reachable  from  Xmm  to  x(t0 )  if  given  any 
Xmjq  and  any  desired  x(t0),  it  is  possible  to  specify  0  so  tf  ^M,t0  =  X  m, to  > 
then  x(to)  =  x(to). 

As  always,  in  studying  conditions  for  reachability,  we  can  set  Xj \ftto 
case 

x(t0)  =  QWmj  o 

where 


g 

A 

[  tf(0)  $(0)  tf(l) 

tf(l)  tf(l)  tt(l)  ... 

(4.6) 

>)  $(M-1)...®(M-1)  , 

2m~1  times 

2m  times 

*(0 

A 

<f>(m(t0),m(t0)  +  i)G{ 

m(tQ)  +  f  +  l) 

(4.7) 

A 

l1 

mi  =  m2 

(4.8) 

[  F(mi  +  l)^(mi  +  l,m2)  mi  <  m2 

<f>(m  —  1,  m) 

A 

F(m) 

(4.9) 

Let  us  define  the  reachability  Gramian 

K(t0,M)  =  ggT 

M— 1 

=  T+1<j)(m(to),m(ta )  +  i)G(m(t0)  +  i  +  1) 

t=0 

x  GT(m(t0)  +  i  +  \)<j)T  (m(tf),m{tQ)  +  i)  (4.10) 

Thus  since  the  rank  of  g  equals  the  rank  of  ggT,  we  see  that  the  system  (4.1)  is 
reachable  from  XM,t0  1°  x(to)  if  is  invertible.  Also  we  can  now  define  a 

notion  of  uniform  reachability: 

Definition  4.2  The  system  (4-1)  is  uniformly  reachable  if  there  exists  7,  M0  >  0 
so  that 


=  0,  in  which 
(4.5) 
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Note  that  7Z(t0,  M )  bears  a  strong  similarity  to  the  standard  reachability  gram- 
mian  for  the  following  system. 

x(m)  =  \p2F{rn  +  1  )x(m  +  1)  +  \/2G(m  +  1  )u(m  +  1)  (4.12) 

Furthermore  the  factor  of  \J2  in  (4.12)  certainly  affects  the  absolute  magnitude  of  the 
reachability  gramian,  but  it  does  not  affect  either  reachability  or  uniform  reachability. 
Thus,  the  usual  conditions  for  temporal  state  space  models  apply  here  as  well.  For 
example,  if  F  and  G  are  constant,  then  reachability  and  uniform  reachability  are 
equivalent  to  the  usual  condition,  i.e.  rank[G\FG\...\Fn~1G]  =  n. 

It  is  interesting  to  note  that  the  structure  of  the  tree  adds  a  substantial  level 
of  asymmetry  to  the  analysis  of  coarse-to-fine  and  fine-to-coarse  systems.  For  ex¬ 
ample,  for  standard  temporal  systems  there  are  two  closely  related  notions,  namely 
reachability-  i.e.,  the  ability  to  reach  any  state  from  any  state — -and  controllability — 
i.e.,  the  ability  to  reach  zero  from  any  state.  If  the  state  dynamic  matrix  is  invertible 
these  are  equivalent,  and  this  is  also  true  for  the  fine-to-coarse  model  (4.1).  However, 
this  is  not  true  for  the  coarse-to-fine  model  ( e.g .  (2.1)  or  it  s  scale- varying  specializa¬ 
tion).  In  particular,  reachability  for  a  coarse-to-fine  model  involves  driving  a  single 
initial  condition  x(t0)  to  any  possible  value  of  the  2M-point  set  of  values  in  Xmjo- 
This  is  an  extremely  strong  condition,  in  contrast  to  the  condition  of  controllability, 
i.e.  driving  x(t0)  to  Xm,i0  =  0.  While  this  is  of  no  direct  interest  to  us  here  (and  we 
refer  the  reader  to  [9]  for  details),  the  dual  of  this  property  is. 

Specifically,  let  us  turn  to  the  problem  of  determining  the  state  given  the  knowl¬ 
edge  of  the  input  and  output.  In  the  standard  temporal  case,  there  are  two  notions — 
observability  (i.e.  the  ability  to  determine  the  initial  condition)  and  reconstructibility 
(i.e.  the  ability  to  determine  the  final  state) — which  coincide  if  the  state  dynamic 
matrix  is  invertible.  The  asymmetry  of  the  tree  certainly  leads  to  a  substantial 
difference  for  us.  For  coarse-to-fine  dynamics,  observability — i.e.  determining  the 
single  coarse  state  from  the  subtree  of  data  beneath  it— is  a  much  weaker  notion  than 
reconstructibility — i.e.  determining  the  2M  states  at  a  fine  scale  based  on  the  subtree 
of  data  above  it.  The  exact  opposite  conditions  hold  for  the  fine-to-coarse  model 
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(4.1),  (4.2) — i.e.  reconstructing  x(t0)  based  on  the  subtree  of  data  below  it  is  a  much 
weaker  condition  than  determining  the  2M  states  in  Xm^o  based  on  the  data  in  the 
subtree  above  it.  Fortunately  for  us,  it  is  the  weaker  of  these  notions  that  we  require 
here.  Thus  we  focus  on  that  case  here  and  refer  the  reader  to  [9]  for  a  full  treatment. 
Let  us  define 

YM,to  =  [  yT(to)\  yT(toa),  yT(iofi) |  ...  \yT(t0aM),...yT(t0/3M)  f  (4.13) 

Definition  4.3  The  system  (4-1), (4-2)  is  reconstructible  from  Xm,i0  to  x(t0 )  if 
given  knowledge  ofWM,t0  and  tjtoi  we  can  uniquely  determine  x(t0). 

As  always  in  studying  reconstructibility  and  observability,  superposition  allows  us 
to  focus  on  the  case  when  Wm^o  —  0  in  which  case 

Ym,  t0  —  TLmXmm  (4-14) 

where  Hm  is  most  easily  visualized  if  we  partition  it  compatibly  with  the  levels  of 
the  observations  in  Ymjo  '■ 
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Hm  = 


2m  blocks 


0(0)  0(0) 

... 

... 

0(0) 

0(1)  ... 

. . . 

0(1) 

0 

... 

0 

0 

... 

0 

0(1)  ... 

... 

0(1) 

0(2)  ... 

0(2) 

0 

0 

0 

0 

0 

0 

0 

0 

0(2)  ... 

0(2) 

0 

0 

0 

0 

0 

0 

0 

0 

6(2)  ... 

0(2) 

0 

0 

0 

0 

0 

0 

0 

0 

0(2)  ... 

0(2) 

Q(M)  0 
0  0(Af)  ... 


0 

0 

0{M) 

(4.15) 


'  *  \  A 


0(0  =  C(m(t0)  +  i)<f>(m(t0)  +  i,m(t0)  +  M) 


Here 


(4.16) 
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As  a  simple  example  to  help  clarify  the  structure  of  the  matrix  TLm  consider  the 
matrix  for  the  scale-invariant  case,  i.e.  where  F(m )  =  F,  C(m)  =  C. 


CF 2 

CF2 

CF2 

CF2 

CF 

CF 

0 

0 

0 

0 

CF 

CF 

C 

0 

0 

0 

0 

C 

0 

0 

0 

0 

C 

0 

0 

0 

0 

C 

(4.17) 


That  is,  at  level  i,  there  are  2l  measurements  each  of  which  provides  information  about 
the  sum  of  a  block  of  2M~‘  components  of  Xmm  ■  Note  that  this  makes  clear  that 
upward  observability  is  indeed  a  very  strong  condition.  Specifically,  since  successively 
larger  blocks  of  Xu,t 0  are  summed  as  we  move  up  the  tree,  subsequent  measure¬ 
ments  provide  no  information  about  the  differences  among  the  values  that  have  been 
summed.  However,  the  situation  for  reconstructibility  is  very  different.  Specifically,  if 
WM,t0  =  0,  then 

x(t0)  =  <f>(m(tQ),m(t0)  +  M)IMXM,t0  (4.18) 

where 

IM  =  [£|£W£J  (4.19) 

2m  times 

and  each  I  is  an  n  x  n  identity  matrix. 

Since  the  condition  of  reconstructibility  only  requires  being  able  to  uniquely  de¬ 
termine  the  single  point  x(t0)  from  the  measurements  in  the  subtree,  we  guarantee 
this  condition  by  requiring  that  any  vector  in  the  nullspace,  of  (4.14)  is  also  in  the 
nullspace  of  (4.18).  Since  is  invertible,  this  is  equivalent  to  being  able  to 

uniquely  determine  ImXmj 0,  he.  the  sum  of  the  components  of  Xmjq  (which  is  all 
that  affects  x(t0))  from  Ym, t0-  We  then  have 


Theorem  4.1  The  system  (4-1),  (4.2)  is  reconstructible  iff  N(Hm)  Q  W(/m); 
which  is  equivalent  to  the  invertibility  of  the  reconstructibility  gramian  O(t0,M): 
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O(t0,M)  —  Im'Hm'H.mIm 

M 

=  ^  22M_‘ 4>t (m(t0)  +  i,  m(t0 )  +  M)CT(m(t0 )  +  i) 

i= o 

x  C(m(t0)  +  i)<f>(m(t0)  +  i,m(t0)  +  M)  (4.20) 

Proof:  We  need  only  show  that  M(Hm)  Q  ■N'(Im)  is  equivalent  to  the  invertibility 
of  O(t0,M)  =  Im'H'm'HmIm-  Suppose  first  that  O(t0,M )  is  not  invertible.  Then 
there  exists  y  ^  0  so  that  7 Imz  =  0  where  z  —  if^y-  Note  that  I'm  is  one-to-one 
so  that  z  ^  0  which  in  turn  implies  that  Imz  =  ImImV  7^  0  which  contradicts 
M(Hm)  £  JY(Im)-  If  on  the  other  hand  M(Hm)  is  not  included  in  W(2 m),  then 
there  exists  an  x  such  that  TLmx  —  0  and  Imx  ^  0.  Since  xe'R.(lJ/I(to))  ©  X(Im(Io)), 
we  can  write  x  =  Ij^y  +  2  where  y  ^  0  and  zcJC{Im)-  Making  this  substitution  into 
Hmx  =  0  and  left- multiplying  by  ImTLm,  we  get 

ImHJ^HmImV  +  Im'Hm'Hmz  =  0  (4-21) 

However,  a  straightforward  but  tedious  calculation  [9]  yields 

Im'Hm'H-m  =  A  tIm  (4.22) 

where  A  is  an  nxn  matrix  computed  from  the  elements  of  TLm  in  (4.15).  Equation 
(4.22)  is  a  consequence  of  the  special  structure  of  TC^TCm-  In  particular  it  indicates 
that  the  columns  of  1 Jj  form  a  block-eigenspace  for  7iJ,f  Hm ■  Indeed,  as  discussed  in 
detail  in  [9],  TC'm'Hm  is  block-diagonalized  by  the  (vector)  Haar  transform,  and  (4.22) 
represents  the  coarsest  scale  component  of  that  transform. 

If  we  now  substitute  (4.22)  into  (4.21)  and  use  the  fact  that  zeM  (Hm),  we  see  that 
$(to)'HM'HM$t(to)y  =  0  for  some  y  ^  0,  implying  that  yT$(to)H'M'H.M$t(to)y  =  0, 
contradicting  the  invertibility  of  O(t0,  M) 

□ 

Also,  as  in  the  case  of  reachability,  it  is  useful  to  define  a  notion  of  uniformity: 

Definition  4.4  The  system  (4-1),  (4-2)  is  uniformly  reconstructible  if  there  ex¬ 
ists  8 ,  M0  >  0  so  that 


0(t,  Mo)  >  81  for  all  t 


(4.23) 
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Furthermore,  note  that  O(t0 ,  M)  is  the  standard  observability  gramian  for  the  fol¬ 
lowing  system. 


a:(m)  =  \:F{m  +  l)a;(m  +  1)  +  - -G(m  +  1  )u(m  +  1)  (4.24) 

y(m )  =  \/2 C{m)x(m)  (4.25) 

Thus  as  for  reachability,  the  conditions  for  reconstructibility  and  uniform  recon- 
structibility  for  our  model  is  the  same  as  the  usual  notions  for  the  pair  F(m),C(m). 
For  example  if  F  and  C  are  constant,  then  (since  F  is  assumed  to  be  invertible), 
reconstructibility  and  uniform  reconstructibility  are  equivalent  to  the  usual  condition 
for  F  and  C  to  be  an  observable  pair. 

4.2  Stability 

Finally,  let  us  examine  the  question  of  asymptotic  stability  for  the  autonomous  version 
of  (4.1),  i.e. 


z(t)  =  F(m(t)  +  l)[2r(ta)  +  z(0)]  +  Q(m(t))u(t)  (4.26) 

as  the  dynamics  propagate  up  the  tree.  Intuitively  what  we  would  like  stability  to 
mean  is  that  z(t)  — »  0  as  we  propagate  farther  and  farther  away  from  the  initial 
level  of  the  tree.  Note,  however,  that  as  we  move  up  the  tree,  z(t)  is  influenced  by 
a  geometrically  increasing  number  of  nodes  at  the  initial  level.  For  example,  z{t) 
depends  on  {z(ta),  z(t/3)}  or,  alternatively  on  {z{ta2),  z(t/3a),  z(ta(3),  z(t/32)},  etc. 
Thus  in  order  to  study  asymptotic  stability  it  is  necessary  to  consider  an  infinite 
dyadic  tree,  with  an  infinite  set  of  initial  conditions  corresponding  to  all  nodes  at  the 
initial  level. 

For  the  remainder  of  this  discussion,  we  adopt  a  change  of  notation  to  a  more 
standard  form  for  stability  analysis  of  dynamic  systems.  Specifically,  we  change  the 
sense  of  our  index  of  recursion  so  that  m  increases  as  we  move  up  the  tree.  In 
particular  we  arbitrarily  choose  a  level  of  the  tree  to  be  our  “initial”  level,  i.e.  level 
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0,  we  now  index  the  points  on  this  initial  level  as  z,(0)  for  i  £  Z.  Points  at  the  mth 
level  up  from  level  0  are  denoted  zt(m)  for  i  £  Z.  The  dynamical  equation  we  then 
wish  to  consider  is  of  the  form 

Zi(m)  =  F(m  -  1  )(z2i(m  -  1)  +  z2i+i(m  -  1))  (4.27) 

Let  Z(m)  denote  the  infinite  sequence  at  level  m,  i.e.  the  set  {z,(m)  ,  i  6  Z}. 
The  p-norm  on  such  a  sequence  is  defined  as 

l|Z("OII,  =  (Ell*MIIJ)*  (4-28) 

i 

where  ||z,-(m)||p  is  the  standard  p-norm  for  the  finite  dimensional  vector  z,(m). 

Definition  4.5  A  system  is  lp- exponentially  stable  if  there  exists  0  <  a  <  1  and 
C  >  0  so  that  given  any  initial  sequence  Z( 0)  such  that  ||Z(0)||P  <  oo, 

||Z(m)||p  <  Ca”||Z(0)||p  (4.29) 

From  (4.27)  we  can  immediately  write  the  following. 

Zi{m)  =  $(m,0)  2i(°)  (4-30) 

where  the  cardinality  of  the  set  Omti  is  2m  and  $(m,  0)  is  the  state  transition  matrix 
associated  with  F(m).  As  one  would  expect,  it  is  this  matrix  that  controls  the  stability 
properties  of  (4.27),  although  the  structure  of  the  tree  leads  to  important  differences. 

Theorem  4.2  The  system  defined  in  eq.(f.27)  is  lp-exponentially  stable  if  and  only 

if 

2^||$(m,0)||p  <  K'jm  for  all  m  (4.31) 

where  0  <  7  <  1  and  K'  is  a  positive  constant,  and 

1  1 
— I — 
p  q 


(4.32) 


4  SYSTEM-THEORETIC  CONCEPTS 


20 


Proof 

Let  us  first  show  necessity.  Specifically,  suppose  that  for  any  K  >  0,  0  <  7  <  1, 
and  M  >  0  we  can  find  a  vector  z  and  an  m  >  M  so  that 

||$(m,0>||p>/f7m2-T||2||p  (4.33) 

Let  z  and  m  be  such  a  vector  and  integer  for  some  choice  of  K,  7,  and  M,  and 
define  an  initial  sequence  as  follows.  Let  pa,pi,P2, ...  be  a  sequence  with 

OO 

I>f  =  l  (4.34) 

;=o 

Then  let 

poz  0  <  i  <  2m 
piz  2m  <  i  <  2  •  2m 

2,(0)  =  i  (4.35) 

Piz  j2m  <  i  <  (j  +  l)2m 

Note  that 

l|Z(0)K  =  2”lkli;  (4.36) 

Thus,  using  (4.33)-  (4.36) 

||^(m)||J  =  2m14>K0).^ 

>  2mp/fp7mp2ri1£|^||P 
=  2mp/Lp7rnp2rT£2-m||Z(0)||p 
=  I<P7mp\\Z(0)\\;  (4.37) 

Hence  for  any  K,  0  <  7  <  1  and  M  >  0  we  can  find  an  initial  /p-sequence  Z{ 0)  and 

an  m  >  M  so  that 

||Z(m)||p>/f7m||Z(0)||p 

so  that  the  system  cannot  be  /p-exponentially  stable. 


(4.38) 
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To  prove  sufficiency  we  use  two  simple  facts.  First,  (4.27)  is  exponentially  stable 
if  there  exist  0<  /?  <  1  and  K  >  0  so  that  for  each  i 

IN(m)l|p  <  E  IM°)II?)'  (4.39) 

This  follows  by  raising  (4.39)  to  the  pth  power  and  summing  over  i.  Secondly,  for 
any  sequence  of  vectors  Xi  and  any  m  and  j 

II  E  ^IIp<2^(  E  IMP*  (4-4°) 

t  €:  2"m ,  j 

where  =  {j,j  +  1,  ...j  +  2m  —  1}.  To  show  this,  note  that 

ll«  +  %<2i(||a||J+|H|J)i  (4.41) 

Specifically,  since  ||  •  ||£  is  a  convex  function,  we  can  write 

IKjJo  +  (1  -  5)4||?  <  (ijlloll?  +  (1  -  5)l|4||;  (4-«) 

from  which  eq.(4.41)  follows  immediately.  Using  this  we  can  show  (4.40)  by  induction 
on  m.  Note  first  that  (4.40)  is  trivially  true  for  m  =  0  .  Suppose  then  that  for  all  j 
(4.40)  holds  for  a  particular  value  of  m.  If  we  then  sum  a:,-  over  the  two  sets  and 
Im,j 2  where  j2  =  j\  +  2m  we  get 

IK  E  xi  +  E  xi)\\ p  ^  2*(ll(  E  x*llp  +  IK  E  (4-43) 

*€^m,  jij  *€2"m,>2 

Then  by  substituting  into  (4.40)  into  (4.43)  we  get 

II  E  o*<2^(||( E  0.11?+ ||(  E  or.iipi  (4.44) 

and  apply  (4.41),  we  find  that  (4.40)  holds  for  m  +  1  as  well. 

We  can  now  complete  the  proof  of  the  theorem.  By  applying  the  p-norm  to 
eq.(4.30),  using  the  Cauchy- Schwarz  inequality,  and  then  (4.40)  we  obtain 

IMro)||p  <  ||$(m,0)||p2T (  53  ||«,-(0)pp 

j  ,» 


(4.45) 
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If  we  then  assume  that  (4.31)  holds  this,  together  with  (4.45)  yields 

IMm)||„  <  E  (4-«) 

from  which  we  conclude  that  the  system  is  /p-exponentially  stable. 

□ 

Note  that  from  this  result  we  see  that  the  ^-exponential  stability  of  (4.27)  is 
equivalent  to  the  usual  exponential  stability  of  the  system 

£(m)  =  2  ?F(m  —  l)£(m  —  1)  (4-47) 

For  example  for  p  =  2,  we  are  interested  in  the  exponential  stability  of 

Z(m)  =  V2F{m  -  l)t(m  -  1)  (4.48) 

If  F  is  constant  this  is  equivalent  to  requiring  F  to  have  eigenvalues  with  magnitudes 
smaller  than 

5  Covariance  Bounds,  Stability,  and  Steady-State 
Behavior 

In  this  section  we  develop  several  system-thematic  results  for  our  fine-to-coarse  filter¬ 
ing  algorithms,  paralleling  those  for  standard  Kalman  filtering,  but  with  several  key 
differences  due  to  the  structure  of  the  dyadic  tree.  We  focus  in  this  section  on  the 
scale-varying  case,  i.e.  the  case  in  which  all  system  parameters  vary  with  scale  only. 
In  this  case  the  Bayesian  filtering  algorithm  of  Section  2  becomes 

£(t|f)  =  x(t\t+)  +  I<(m(t))[y(t )  -  C(m(t))5(t|^+)]  (5.1) 

£(^7|  t)  —  i?(m(t))x(i|t)  (5.2) 


ar(f|t+)  =  P(m(t)\m(t)+)P  1(m(i)|m(t)  +  l)[i(t|to:)  +  x(t\t/3)\  (5.3) 
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with  the  scale-varying  Riccati  equation 

P(m|m+1)  =  F(m+l)P(m+l|m-fl)i?T(m-|-l)-(-G(m-f l)Q(m-t-l)GT(ra-t-l)  (5.4) 

P~x{m\m )  =  2P~1(m\m  +  1)  +  CT(m)R~1(m)C(m)  —  P~l(m )  (5.5) 

where  we  have  combined  the  update  and  fusion  steps  in  (5.5).  Also  F(m(t))  and 
Q{m{t))  are  given  by  (2.8),  (2.10)  in  the  scale-varying  case  and 

G(m)  =  A~1(m)B(m)  (5.6) 

Furthermore,  the  remaining  quantities  needed  in  (5.1)-(5.2)  are 

P-X(m|m+)  =  2P-1(m\m  +  1)  -  P;l{m)  (5.7) 

K(m)  =  P(m|ro)C'T(m)i?-1(m)  (5.8) 

In  the  ML  case,  with  P~x  set  to  zero  we  obtain  a  further  simplification  of  these 
equations  or,  equivalently,  of  (3.14): 

XML(t\t)  =  |(/  -  KML{rn(t))C{m{t))A~l(m{t)  +  l)(xML(ta\ta)  +  XML{tp\t(3 )) 

+  KML(m(t))y(t)  (5.9) 

Similarly  we  have  the  following  simplified  form  of  (3.18)  for  the  ML  filter  error: 

XML(t\t )  =  ^(/  -  AriWL(m(t))C(m(t)))A_1(m(t)  +  l)(xML{ta\ta)  +  XML(tfi\t/3 )) 

-  “  KMIl(m(t))C(m(t)))G(m(t)  +  1  )(w(ta)  +  w(0 ))  -  KML(m(t))v(t) 

(5.10) 


The  ML  Riccati  equation  in  this  case  becomes 
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-Pmx(»™|»71  +  1)  =  A  1(m  +  l)P(m+l|m  +  l)A  T(m  +  l)  +  G(m  +  l)(7r(m-f  1)  (5.11) 

=  2P'ML{rn\rn  +  1)  +  CT(m)R~1(m)C(m)  (5.12) 

Also 

Kml( m)  =  PML(m|m)C'r(m)i?_1(m)  (5.13) 

and  also,  for  future  reference, 

Piv/L(m|m+)  =  ^PML(m\m  +  1)  (5.14) 

and  this,  together  with  (5.13)  yield 

^[/  -  KML(m)C{m)]  =  PML(m\m)PML(m\m  +  1)  (5.15) 

5.1  Bounds  on  the  Error  Covariance 

In  this  section  we  derive  upper  and  lower  bounds  on  the  error  covariances  P(m\m) 
and  PML,(m \m)-  As  is  the  case  for  standard  Kalman  filtering,  [3,8],  reachability  and 
reconstructibility  conditions  are  key  in  this  analysis.  In  this  case  the  system  to  be 
analyzed  is  the  following  backward  model,  obtained  directly  from  (2.7)-(2.10)  in  the 
scale-varying  case: 

x(t)  =  ^F(m(t)  +  l)[x(<a)  +  x(tP)]  +  ^G(m(t)  +  l)[tt>(fa)  +  w(tfi)]  (5.16) 

together  with  the  measurements  (2.2).  In  this  case,  accounting  for  the  scaling  factor 
of  \  in  (5.16)  and  the  covariances  of  w  and  v,  we  define  the  stochastic  reachability 
Grammian: 

—  a  M~l 

7 Z(t,M)  =  y;  2~*~  V(ro(t),  m(t)  +  i)G(m(t)  +  i  +  1) 

i=o 

x  Q(m(t)  +  i  +  l)GT(m(t)  +  i  +  l)<f>T(m(t),m(t)  +  i)  (5.17) 
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and  the  stochastic  reconstructibility  grammian: 

_  .  M 

0(t,M)  —  y;  2 '(j)T(m(t)  +  i,  m(t)  +  M)CT(m(t)  +  Q 

t=0 

x  +  0C(m(t)  +  z)«^(m(t)  +  i,  m(t)  +  M)  (5.18) 

where  the  state  transition  matrix  is  given  by  (4.8)-(4.9).  In  our  analysis  we  also 
assume  that  A(m),  A~1(m),  B(m ),  PI_1(m),  C(m),  R(m ),  and  are  bounded 

functions  of  m.  In  terms  of  our  reachability  and  reconstructibility  grammians  these 
assumptions  mean  that  for  any  M0  >  0  we  can  find  a,  /3  >  0  so  that 

Tt{t,  M0)  <  al  for  all  t  (5.19) 

O(t,M0)  <  fll  for  all  t  (5.20) 

Also  uniform  reachability  in  our  present  context  corresponds  to  the  existence  of 
7,  M0  >  0  so  that 

~R(t,Mo)  >  7 1  for  all  t  (5.21) 

while  uniform  reconstructibility  corresponds  to  the  existence  of  8,  M0  >  0  so  that 

0{t,  M0)  >  81  for  all  t  (5.22) 

These  conditions  coincide  with  those  in  Section  4.1  with  the  replacement  of  F(m) 
by  | F(m),  G(m )  by  ^G(m)Q%(m),  and  G(m)  by  R~?  (m)C(m).  Furthermore,  thanks 
to  the  boundedness  assumptions,  the  relationship  between  the  original  model  (2.1)  in 
the  scale-varying  case  and  the  reverse  model  (2.7)-(2.10),  and  the  analysis  in  Section 
4.1,  it  is  straightforward  to  show  that  the  uniform  reachability  and  reconstructibility 
conditions  are  equivalent  to  the  usual  conditions  for  the  pairs  (A~x(m), G(m))  and 
(R~2  ( m)C(m ),  A_1(m)),  respectively. 

We  are  now  in  a  position  to  derive  an  upper  bound  for  the  optimal  filter  error 
covariance,  P(m\m).  The  general  idea  in  deriving  this  bound  is  to  make  a  care¬ 
ful  comparison  between  the  Riccati  equations  for  our  optimal  filter  and  the  Riccati 
equations  for  the  standard  Kalman  filter.  First  consider  the  following  lemma. 
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Lemma  5.1  Let  P{m\m)  be  the  solution  to  the  Riccati  equation  (5.f)-(5.5)  and  let 
P(m\m)  satisfy  the  second  Riccati  equation 


Then 


P(m\m  +  1)  =  F(m  +  l)P(m  +  l|m  +  l)FT(m  +  1) 

-f-  G(m  -j-  \}Q(rn  -f- 1  )G'^(m  +  1)  (5.23) 

P  1(m|m)  =  P  1(m\m  +  1)  +  CT(m)R~1(m)C(m)  (5.24) 

P  1(m|m)  <  P~1(m\m )  (5.25) 


Proof 

First  note  that  (5.5)  can  be  rewritten  as 

P_1(m|m)  =  P_1(m|m  +  1)  +  CT(m)R~1(m)C(m)  +  DT(m)D(m)  (5.26) 


since  P(m|m  +  1)  <  Px(m).  Also,  the  Riccati  equation  (5.23),  (5.24),  characterizes 
the  error  covariance  for  the  optimal  filter  corresponding  to  the  following  standard 
filtering  problem. 


x(m )  =  F(m  +  l)x(m  4- 1)  +  G(m  +  l)w(m  +  1)  (5.27) 

E[w(m)wT(m )]  =  Q(m)  (5.28) 

y(m)  =  C(m)x(m)  +  v(m)  (5.29) 

E[v(m)vT(m )]  =  R{m)  (5.30) 


Similarly,  the  Riccati  equation, (5. 4),  (5.26),  characterizes  the  error  covariance  for  the 
optimal  filter  corresponding  to  the  filtering  problem  involving  the  same  state  equation 
but  with  the  following  augmented  measurement  equation. 


y(m)  = 


C(m) 

D(m) 


x(m)  +  u(m) 


E[u(m)uT(m)] 


R(m)  0 

0  I 


(5.31) 

(5.32) 


so  that  (5.25)  follows  immediately. 
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□ 

We  now  state  and  prove  the  following  theorem  concerning  an  upper  bound  for 
P(ra|m). 

Theorem  5.1  Suppose  there  exists  0,6,  Mo  >  0  so  that  (5.20)  and  (5.22)  are  satis¬ 
fied.  Then  there  exists  k  >  0  such  that  for  all  m  at  least  M0  levels  from,  the  initial 
level  P{m\m )  <  kI . 

Proof 

As  we  have  discussed,  (5.20),  (5.22)  are  equivalent  to  the  existence  of  analogous 
uniform  upper  and  lower  bounds  on  the  observability  Gramian  for  (5.27)-(5.30).  Thus 
standard  Kalman  filtering  results  imply  that  there  exists  a  k  >  0  such  that  P(m\m)  < 
kI  or  P  1(m|m)  >  k-1/.  From  Lemma  5.1  we  then  have  that  P~x(m\m)  >  k~xI  or 
P(m|m)  <  kI. 
a 

We  can  easily  apply  the  previous  ideas  to  derive  an  upper  bound  for  PML{m\m)  as 
well:  Specifically  note  that  the  identical  idea  used  in  Lemma  5.1  yields  an  analogous 
result  for  the  ML  Riccati  equation  (5.11),  (5.12),  i.e. 

/>-1(m|m)  <  Pml(^I^)  (5.33) 

where  P(m\m)  is  the  solution  of  a  Riccati  equation  as  in  (5.23),  (5.24)  but  with  F  and 
Q  replaced  by  A-1  and  I,  respectively.  Then,  as  we  have  discussed,  the  conditions 
(5.20),  (5.22)  are  equivalent  to  the  analogous  conditions  on  the  usual  observability 
gramian  for  the  pair  (R~^(m)C(m),  A-1(m)).  This  in  turn  yields  an  upper  bound  on 
P(m\m).  Using  (5.33)  we  then  have  the  following 

Theorem  5.2  Suppose  that  there  exists  0,6,  Mo  >  0  so  that  (5.20)  and  (5.22)  are 
satisfied.  Then  there  exists  k'  >  0  such  that  for  all  m  at  least  Mo  levels  from  the 
initial  level  Pml(^|”^)  <  «'/. 

We  now  turn  to  the  lower  bound  for  P(m\m).  We  begin  with  the  following 
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Lemma  5.2  Let 

S(m\m)  =  ^(P-1(m|m)  —  CT(m)R~1(m)C(m)  +  P~1(m))  (5.34) 

S(m\m  +  1)  ==  F~T(m  +  l)P_1(m  +  l|m  +  l)F~1(m  +  1)  (5.35) 

Consider  also  the  Riccati  equation 

5*(m|m  +  l)  =  2F~T(m  +  l)S'*(m  +  l|m  +  l)F~l(m  +  1) 

+  F~T(m  +  l)CT(m)R~1(m)C(m)F~1(m  +  1)  (5.36) 

S*  1  (m\m)  —  S*  1  (m|m  +  1)  +  G{m  +  l)Q{m  +  l)GT(m  +  1)  (5.37) 

where  S'(OIO)  —  S*(0|0).  Then  for  all  m ,  5*(m|m)  >  S(m|m). 

Proof 

A  straightforward  calculation  using  (5.5)  and  (5.34)  yields 

S'(m|m)  =  P-1(m|m  +  1)  (5.38) 

Then  using  (5.4)  we  arrive  at 

S'(mlm)  =  [F(m  +  l)P(m  +  1  |m  +  l)FT(m  +  1) 

+  G(m  +  V)Q{m  +  1)GT  (m  +  1)]  1 

=  [S'  1(m\m  -f  1)  +  G(m  +  l)Q(m  +  l)GT(m  -f  1)]_1  (5.39) 

where  the  the  last  equality  results  from  the  substitution  of  (5.35).  Also,  by  substi¬ 
tuting  (5.34)  into  (5.35)  and  collecting  terms  we  obtain 

S’(m|m  +  1)  =  2F~T(m  +  l)S'(m  +  l|m  +  l)jF-1(m  +  1) 

+  F~T(m  +  l)CT  (m)R~1  (m)C  (m)F~1  (m  +  1) 

-  F~T(m  +  l)P~1(m)F~1(m  +  1)  (5.40) 

Now  we  prove  by  induction  that  for  all  m  S'*(m|m)  >  S'(m|m).  Obviously,  5*(0|0)  > 
5(0|0).  As  an  induction  hypothesis  we  assume  S*{i  +  l|z  +  1)  >  S(i  +  1  |z  4- 1).  From 
(5.40),  (5.36),  and  the  fact  that  F~T(m  +  +  1)  >  0  we  get  that 

S*~‘ (i\i  +  1) 


(5.41) 
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Substituting  (5.37)  and  (5.39)  into  (5.41)  and  canceling  terms  we  arrive  at  S*  1  (i  |«)  < 
S  1  (z [*'),  i.e.  5'*(i|i)  >  S'(ili). 

□ 

Theorem  5.3  Suppose  that  there  exists  a,7,M0  >  0  so  that  (5.19)  and  (5.21)  are 
satisfied.  Then  there  exists  L  >  0  such  that  for  all  m  at  least  Mo  levels  from  the 
initial  level  P{m\m)  >  LI. 

Proof 

From  standard  Kalman  filtering  results  we  know  that  the  solution  to  the  standard 
Riccati  equation  (5.36),  (5.37)  satisfies  Samira)  <  NI.  for  some  N  >  0  if  the  pair 
{Q^  ( m)GT(m )  ,  F  T(m))  is  bounded  and  uniformly  observable.  However,  by  standard 
duality  results  and  the  boundedness  of  F,  this  is  equivalent  to  the  boundedness  and 
uniform  reachability  of  (F(m),G(m)Q^(m)),  which  in  turn  are  equivalent  to  (5.19), 
(5.21)  .  Then  from  Lemma  5.2  we  conclude  that  S(m\m)  <  NI. 

Thus  (5.34)  together  with  the  boundedness  assumption  yields 

p-\m\m)  <  IT1/  (5.42) 

Using  analogous  arguments  we  can  derive  a  lower  bound  for  Pml("z |m).  Note 
that  with  the  following  definitions  for  S  and  (5.36),  (5.37)  where  the  matrices  F  and 
Q  are  replaced  with  the  matrices  A -1  and  /,  respectively,  Lemma  5.2  still  applies. 

S(m\m)  =  7:(PML(m\m)  ~  CT(m)R~1(m)C(m ))  (5.43) 

.S^mlm  +  l)  =  Ar(m  +  l)P^(m  +  l|m  +  l)A(m  +  1)  (5.44) 

Using  the  same  argument  as  in  the  proof  of  Theorem  5.3  with  our  current  definitions 
for  S  we  get  that 

-  CT(m)R-\m)C(m))  <  NI  (5.45) 

for  N  >  0,  and  the  boundaries  assumption  then  yields 

Theorem  5.4  Suppose  that  there  exists  a,  7,  M0  >  0  so  that  (5.19)  and  (5.21)  are 
satisfied.  Then  there  exists  V  >  0  such  that  for  all  m  Pml(^  I7™)  L'l. 
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5.2  Filter  Stability 

We  are  now  in  a  position  to  analyze  the  internal  stability  of  the  ML  filter  error 
dynamics  (5.10)  ,  i.e.  using  (5.15)  we  are  interested  in  investigating  the  asymptotic 
stability  of  the  autonomous  error  dynamics 


t(t)  =  PML{m(t)\m(t))PM\(m{t)\rn(t)  +  1)A  1(m(t)  +  l)[£(ia)  +  £(t0)\  (5.46) 

In  particular,  we  have  the  following 

Theorem  5.5  Suppose  that  (5.19)-(5.22)  are  satisfied.  Then,  the  ML  error  dynamics 
(5.10),  or  equivalently  (5.46),  are  l2- exponentially  stable. 

Proof 

Based  on  the  analysis  in  Section  4.2,  we  see  that  we  wish  to  show  that  the  following 
causal  system  is  stable  in  the  standard  sense: 

z(m)  —  PML{m\m)P^L{rn\rri  +  -l)v^L4-1(m  +  l)z(m  +  1)  (5-47) 

The  remainder  of  the  analysis  follows  the  line  of  reasoning  used  in  [3].  Specifically, 
thanks  to  Theorems  5.2  and  5.4,  Pml^ |m)  is  bounded  above  and  below  by  positive 
definite  matrices.  Thus  we  can  define  the  following  Lyapunov  function. 

V(z(m),m )  =  zT  (m)Pj(fiL(rn\m)z(rn)  (5.48) 

Let  us  also  define  the  following  quantity. 

z(m)  =  y/2A~l{m -{■  \)z(m  +  l) 

=  PML(m\m  +  l)Pi^1i(m|m)2:(m)  (5.49) 

Substituting  (5.12)  into  (5.48)  followed  by  algebraic  manipulations,  one  gets 

V(z(m),m)  =  zT(rn)(2PTfi(rn\m  +  1)  +  CT(rn)R~1(m)C(rn))z(m) 

=  2zT  (m){P44\{m\m)  —  2P^(m|m  +  1  ))z{m)  —  zT  (m)CT(m)R~1(m)C(m))z(m) 
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+ 

+ 


zT  (m)(2PM\{m\m  +  l))*(m) 

^P^miru  +  1  m  -  %Wt(m|m  +  1)^ 


y/2  y/2 


V2 


~(y/2z(m)  -  PMlL{m\m  +  1  )(y/2z(m)  - 

zT(m)CT(m)R~1(rn)C(m)z(m )  +  -- P^L(m  \m  +  1)-^  ^ 


£(m) , 


(5.50) 


However,  using  (5.49)  we  have  that 

=  zT(m  +  l)A_T(m  +  l)^Mi(mlm  T  +  l)-z(m  +  !) 

<  zT(m  +  l)Pj^L{m  +  l|m  +  l)z(m  +  1) 

=  V(z(m  +  1),  m  +  1)  (5.51) 


where  the  inequality  follows  from  (5.11)  and  application  of  the  matrix  inversion 
lemma.  Thus  it  follows  that 

V(z(m),m)  —  V(z(m  +  l),m  +  1)  <  ~{\/2z{m)  -  ^^-)T P^\{m\m  A  l){\/2z(rn)  - 

—  zT(m)CT  (rn)R~1(m)C(rn)z(m)  (5.52) 

Stability  follows  from  (5.52)  under  the  condition  of  uniform  observability  of  the  pair 
(R~%(m)C(m),  A_1(m)) 

□ 

Let  us  now  examine  the  full  estimation  error  after  incorporating  prior  statistics. 

It  is  straightforward  to  see  that 

*(*10  =  />(m(t)|m(t))(P^1L(m(t)|m(t))5ML(«W  +  P*1  (5.53) 

Thus  we  can  view  x(t\t)  as  a  linear  combination  of  the  states  of  two  upward-evolving 
systems,  one  for  xml{AP)  and  one  f°r  Px1(rn(^))x(^)-  Note  first  that  since  P(m\m)  < 
PML{m\m ) 


\lP(m(t)\m(t))PMl(m(t)\m(t))xML(tlt)jl  <  ||*Afi(<|0ll 


(5.54) 
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and  we  already  have  the  stability  of  the  dynamics  from  Theorem  5.5.  Turn¬ 

ing  to  the  second  term  in  (5.53),  note  that  the  covariance  of  Pfi1  (m(t))x(t)  is  sim¬ 
ply  P~1(m(t)).  By  uniform  reachability  P“1(m(t))  is  bounded  above.  Thus,  since 
P(m(t)\m(t))  is  bounded,  the  contribution  to  the  error  of  the  second  term  in  (5.53)  is 
bounded.  Finally,  our  analysis  also  allows  us  to  conclude  that  the  full,  driven  XMu(t\t) 
dynamics  (5.10)  are  bounded-input,  bounded-output  stable  from  inputs  w  and  v  to 
output  XML(t\t)- 

5.3  Steady-state  Filter 

In  this  section  we  focus  on  the  constant  parameter  case  and  analyze  the  asymptotic 
properties  of  the  filter.  Specifically,  we  have  the  following: 

Theorem  5.6  Consider  the  following  system  defined  on  a  tree. 

x(t)  —  Ax{Py)  +  Bw(t )  (5.55) 

y(t)  =  Cx(t)  +  v(t)  (5.56) 

with  independent  white  noises  w  and  v  having  covariances  I  and  R,  respectively. 
Suppose  that  ( A,B )  is  a  reachable  pair  and  that  (C,  A)  is  observable.  Then ,  the  error 
covariance  for  the  ML  estimator,  PML(m\m),  converges  as  m  — >  — oo  to  Poo,  which 
is  the  unique  positive  definite  solution  to 

Poo  =  ^a-1p00a-t  +  1-ggt 

-  Koo{l-CA-lPooA-TCT  +  l-CGGTCT  +  R)I<1  (5.57) 

where 

Koo  =  PooCtR~1  '  (5.58) 

Moreover,  the  autonomous  dynamics  of  the  steady-state  ML  filter,  i.e. 

e(t)  =  1(7  -  K0oC)A~1(e(ta)  +  e(tfi)) 


(5.59) 
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are  /2-exponentially  stable,  i.e.  the  eigenvalues  of  | (/  —  K00C)A~1  are  less  than  \/2/2 
in  magnitude. 

Proof 

Note  first  that  the  reachability  of  (A,  B)  and  observability  of  ( C ,  A)  are  equivalent 
to  the  reachability  of  (A~X,G)  and  observability  of  (R~*C,  A~x),  respectively. 

Convergence  of  Pjvfz,(ra|m) 

This  will  be  established  if  we  can  show  that  a)  PML(m\m )  Is  monotone-nonincreasing 
as  m  — ►  —00  and  b)  PjwL(m|m)  is  bounded  below.  The  second  of  these  conditions 
comes  directly  from  the  assumptions  of  reachability  and  observability.  The  mono¬ 
tonicity  of  PML(m\m)  follows  from  an  argument  analogous  to  that  used  in  the  stan¬ 
dard  case  [8].  Specifically,  let  P(m;m')  denote  the  solution  to  the  scale-invariant  ver¬ 
sion  of  the  ML  Riccati  equation  (5.11),  (5.12)  initialized  at  m!  with1  P(m';  m!)  =  00. 
That  is  P(m;  m ')  equals  PML{m |ro)  if  the  coarse  level  at  which  we  begin  is  m! .  Since 
the  parameters  of  (5.11),  (5.12)  are  constant,  we  immediately  see  that 

P(m;  m')  =  P(m  —  m!)  (5.60) 

so  that  the  monotonicity  result  we  wish  to  show  is  equivalent  to  showing  that  if 
m\  >  ra2,  then 

P(m;mi)  <  P(m;m2)  for  all  m  <  m2  (5.61) 

However,  the  scale-invariant  Riccati  equation  certainly  preserves  positive  definite  or¬ 
derings  so  that  the  inequality  in  (5.61)  holds  for  m  =  m0,  then  it  must  hold  for  all 
m  <  ra0.  However  at  m  =  m2,  P(m2;  mi)  <  00  =  P(m2;  m2),  so  that  (5.61)  is  in  fact 
true. 

Having  established  the  convergence  of  P/wx(mlm)i  let  us  denote  the  limit  as  fol¬ 
lows. 

lim  PML(m\m)  =  Pc 0  (5.62) 

m— ►  00 

JTo  be  precise  here  we  should  use  the  information  form  of  (5.11),  (5.12)  (see  Appendices  A,B). 
However,  thanks  to  observability  and  reachability,  for  \m  —  m! |  sufficiently  large  P(m;  mr)  is  well- 
defined  and  invertible.  Since  we  are  interested  in  the  asymptotic  behavior  as  m  —  m'  — ►  —00,  the 
argument  given  above  is  valid. 
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It  is  straightforward  to  see  that  P <*,  must  satisfy  (5.57),  which  is  the  steady-state 
version  of  the  constant-coefficient  ML  Riccati  equation  (5.11),  (5.12).  Furthermore, 
by  Theorem  5.4,  P <*,  must  be  positive  definite. 

Exponential  Stability  of  ~(I  —  K00C)A~l 

What  we  need  to  show  is  that  if  P^  is  any  positive  definite  solution  to  (5.57),  then 
each  eigenvalue,  A,  of  ^(I  —  K00C)A~1  has  magnitude  less  than  1,  where  is  given 
by  (5.58).  The  approach  is  a  variation  of  the  proof  for  the  standard  Riccati  equations 
[8].  Specifically,  some  algebra  shows  that  we  can  rewrite  the  Riccati  equation  (5.57) 
in  the  following  form: 

Poo  =  [^(I-K00C)A-1}P^(I-K^C)A-1}T+^(I-K00C)GGT(I-K^C)T+KxRKi 

(5.63) 

Suppose  that  there  exists  an  eigenvalue  with  |A|  >  1.  Then  letting  x  be  the 
associated  eigenvector  of  [^(7  —  K00C)A~l]T ,  we  see  that 

xHPooX  =  \\^xH P^x  +  |  A  \2xhBBtx  +  xH  KooRK^x  (5.64) 

where  xH  is  the  conjugate  transpose  of  x  and  we  have  used  the  fact  that  G  =  A-1  B. 

Since  Poo  >  0  and  |A|  >  1,  we  can  conclude  from  (5.64)  that  BTx  =  0  and  K^x  =  0, 
but  the  latter  of  these  implies  that  ^ A~Tx  =  Arc.  That  is,  we  have  a  vector  x  so 
that 

xH  A~l  =  V2\Hx,  xhB  =  0  (5.65) 

which  implies  that  (A-1,P)  is  not  a  reachable  pair  which  in  turn  contradicts  the 
assumption  that  {A~X,G)  —  (A~1,A~1B),  or,  equivalently,  ( A,B )  is  reachable. 

Uniqueness  of  P ^ 

Consider  Px  and  P2,  both  of  which  are  positive  definite  and  satisfy  (5.57).  Thus, 
for  *  =  1,2 

Pi  =  \A-xPiA-T  +  \GGT 

-  I <i{\cA-xPiA-TCT  +  ]-CGGTCT  +  R)I<J  (5.66) 

I<i  =  PiCTR~l  (5.67) 
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Subtracting  (5.66)  with  i  =  2  from  (5.66)  with  i  =  1  yields 

P1-P2  =  y-(I-K1C)A-1(P1-P2)(^-(I-K1C)A-1)1 

+  A  (5.68) 

where 

A  =  (I< x  -  K2)[\CA-1P2A-TCT  +  \CGGTCT  +  R]{K  1  -  I<2)T  >  0  (5.69) 

Li 

Note  that  we  have  established  the  fact  that  ^(7  —  K\C)A~X  has  eigenvalues  within 
the  unit  circle.  From  standard  system  theory  this  tells  us  that  P\—P2  >  0.  Reversing 
indices  yields  P2  —  P\>  0,  proving  uniqueness. 

□ 

Finally  let  us  comment  on  the  asymptotic  behavior  of  the  Bayesian  error  covari¬ 
ance  P(m|m),  which  is  given  by 

P(m\m)  =  [PML(m\m)  +  P'^m)]-1  (5.70) 

Since  the  original  state  process  is  defined  evolving  from  coarse-to-fine  while  the  re¬ 
cursion  of  the  ML  filter  is  in  the  opposite  direction,  we  need  to  be  a  bit  careful  about 
defining  exactly  what  we  mean  by  the  asymptotic  behavior  of  (5.70).  Specifically, 
what  we  mean  here  is  its  asymptotic  behavior  at  a  finite  value  of  m  as  both  the 
bottom  and  top  levels  of  the  tree  recede.  Note  that  while  the  convergence  of  P^m) 
depends  upon  the  stability  of  A,  the  convergence  of  P~l(m)  does  not.  Specifically, 
since  (A,  B)  is  reachable,  it  is  easily  seen  (e.g.  by  examining  the  Riccati  equation  for 
P~x(m)  obtained  from  (2.5))  that  P~l{m)  does  converge  as  m  increases.2  Thus,  if 
we  let  Sx  denote  that  limiting  value,  then  P(m|m)  converges  to  [Pj^  +  S^]-1. 

6  Conclusions 

In  this  paper  we  have  analyzed  in  detail  the  new  class  of  multiscale  filtering  and 

smoothing  algorithms  developed  in  [1],  based  on  dynamic  models  defined  on  dyadic 

2The  two  extreme  cases  being  when  A  is  stable,  so  that  — ►  P~l  where  Px  is  the  positive- 

definite  solution  of  (2.6),  and  when  A~l  is  stable,  in  which  case  P~l(m)  —*  0. 
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trees  in  which  each  level  in  the  tree  corresponds  to  a  different  resolution  of  signal  rep¬ 
resentation.  In  particular,  this  framework  leads  to  an  extremely  efficient  and  highly 
parallelizable  scale-recursive  optimal  estimation  algorithm  generalizing  the  Rauch- 
Tung-Striebel  smoothing  algorithm  to  the  dyadic  tree.  This  algorithm  involves  a 
variation  on  the  Kalman  filtering  algorithm  in  that,  in  addition  to  the  usual  measure¬ 
ment  update  and  (fine-to-coarse)  prediction  steps,  there  is  also  a  data  fusion  step. 
This  in  turn  leads  to  a  new  Riccati  equation.  As  we  have  seen  the  presence  of  the  data 
fusion  step  leads  to  a  complication  in  filter  and  Riccati  equation  analysis,  and  this 
motivated  the  derivation  in  this  paper  of  an  alternative  ML  algorithm  which  leads  in 
turn  to  a  variation  on  the  RTS  procedure  corresponding  to  the  triangularization  of 
the  Hamiltonian  description  of  the  optimal  smoother. 

The  major  emphasis  of  this  paper  is  on  the  development  of  system-theoretic  con¬ 
cepts  of  reachability,  reconstructibility,  and  stability  for  fine-to-coarse  dynamic  mod¬ 
els  which  we  then  used  to  analyze  the  multiscale  Kalman  filter  error  dynamics  and 
Riccati  equation.  Specifically,  as  we  have  seen,  the  structure  of  the  dyadic  tree  leads 
to  significant  differences  in  these  system-theoretic  concepts  as  compared  to  their 
counterparts  for  standard  state-space  models.  Using  these  concepts,  we  have  deter¬ 
mined  reconstructibility  and  controllability  conditions  under  which  the  solution  to 
the  Riccati  equation  is  bounded  above  and  below,  the  Kalman  filter  error  dynamics 
are  asymptotically  stable,  and,  in  the  constant-parameter  case,  the  Riccati  equation 
solution  converges  to  a  unique,  steady-state  solution. 

As  we  discuss  in  [1]  multiresolution  methods  of  signal  and  image  analysis  are  of 
considerable  interest  in  research  and  in  numerous  applications.  One  of  our  objectives 
in  [1],  the  present  paper,  and  our  paper  [2]  on  multiresolution  realization  theory  is  to 
demonstrate  that  there  is  a  substantial  role  for  the  systems  and  control  community 
in  this  field.  Indeed  it  is  our  opinion  that  there  are  a  broad  range  of  opportunities 
for  further  work  in  both  theory  and  application,  and  it  is  our  hope  that  our  work  will 
help  to  stimulate  activity  in  this  fascinating  and  important  area. 
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A  Appendix 

In  this  appendix  we  derive  several  results  related  to  the  ML  filter  described  in  Section 
3.  The  first  is  to  show  that  XML,(t\t)  as  defined  in  this  section  is  indeed  the  ML 
estimate  of  x(t)  based  on  Yt  and  also  that  it  satisfies  (3.9)  -  (3.10).  To  do  this,  we 
start  by  writing 

Yt  =  Htx(t)  +  6(t)  (A.l) 

where  6(t)  is  a  zero-mean  noise  vector  constructed  from  process  and  measurement 
noises  in  the  subtree  under  t  using  (2.1),  (2.2).  The  ML  estimate  of  x(<)  based  on 
(A.l)  is  precisely  XML(t\t),  while,  using  standard  ML  estimation  results  [4]  x(t\t)  is 
the  ML  estimate  of  x(t)  based  on  (A.l)  together  with  one  additional  “measurement”. 

0  =  x(t)  =  rj(t)  (A. 2) 

where  T](t)  is  a  zero-mean  noise  vector,  independent  of  6{t )  and  with  covariance  Px(t). 
From  this  it  is  straightforward  to  verify  (3.9)-  (3.10). 

To  verify  the  recursive  formulae  (3.1)-  (3.8)  note  that  £ml(*7|0  is  ML  esti¬ 
mate  based  on  Yt  together  with  one  additional  “measurement”  namely  the  dynamical 
relation  (2.1)  between  x(t)  and  x(tj).  Using  results  on  recursive  ML  estimation  [4], 
XML(tl\t)  is,  equivalently,  the  ML  estimate  of  x{Vf)  given  the  “measurement”. 

XML(t\t)  =  A(f)x(tq)  +  w(t)  +  xML(t\t )  (A. 3) 

where  the  estimation  error  xj vfi,(<|t)  is  zero- mean,  independent  of  w(t),  and  with 
covariance  Pml{1  10-  Eqs.  (3.5)-  (3.6)  follow  directly  from  this.  The  fusion  step 
(3.7),  (3.8)  then  follows  directly  from  standard  ML  results  [4]  on  the  fusion  of  ML 
estimates  based  on  disjoint  data  sets  with  independent  noises,  since  XjVf£,(f|fa)  is  the 
ML  estimate  based  on  Ytol  together  with  (2.1)  evaluated  at  ta ,  while  XML(t\tft)  is 
based  on  YtQ  and  (2.1)  evaluated  at  t/3.  Similarly  the  update  step  (3.1)-  (3.4)  follows 
from  the  standard  result  on  incorporating  a  new,  independent  measurement  (namely 
(2.1)). 
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The  second  result  to  be  derived  is  the  equivalence  of  (2.23)  -  (2.25)  and  (3.11) 
-  (3.13).  We  begin  with  the  equivalence  of  (2.24)  and  (3.13).  That  is,  using  (2.8), 
(3.11)  and  its  counterpart  for  P(Py\t)  we  wish  to  verify  that 

(A.4) 

Algebraic  manipulation  of  this  relationship  yields  the  equivalent  form 

PML{tl\t)  =  A~1(t)PML(t\t)A~T(t)  +  A~1(t)Px(t)A~T(t)  -  Px(t*f)  (A. 5) 

Eqs.  (2.4)  and  (3.6)  then  verify  this  equality.  Secondly,  to  verify  the  equivalence  of 
(2.23)  and  (3.11)  we  must  show  that 

(/  -  J(t)F(t))x(t\t)  =  (J  -  J{t)A-\t))xML(t\t)  (A. 6) 

(where  we  have  expressed  one-step  predicted  estimates  in  terms  of  updated  estimates.) 
Using  (3.9),  we  must  show  that 

(i  -  j(t)F(t))(p^(t\t) + p^wr'p&m  =  i  -  (a  .?) 

Rearranging  and  using  (2.8)  and  (3.13),  we  find  that  this  is  equivalent  to  (A.5),  which 
verifies  (A. 6).  Next,  we  must  verify  the  equivalence  of  (2.25)  and  (3.12).  i.e.  we  must 
show  that 


pm  -  j (t)p(t*f\t)JT(t)  =  pML(t\t)  ~  j(t)pML(mJT(t)  (a.8) 

Again  algebraic  manipulations  reduce  (A.8)  to  (A.5),  finishing  this  verification. 

Finally,  straightforward  algebraic  manipulations  on  (3.1)  -  (3.10)  lead  to  an  in¬ 
formation  filter  version  of  the  ML  algorithm.  Specifically,  let  S  denote  the  inverse 
covariance  and  2  the  state  of  the  information  filter,  i.e. 

S{t\t)  =  PMlm,  S(t\t+)  =  P^L(t |*+),  etc.  (A. 9) 

z(t\t)  =  S(t\t)xML(t\t),  z(t\t+)  =  S(t\t+)xML(t\t+),  etc.  (A. 10) 

Then,  we  have  the  following  algorithm 
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z(t\t)  =  z(t\t+)  +  CT(t)R~1(t)y(t )  (A. 11) 

z(Py\t)  =  JT(t)z(t\t )  (A. 12) 

i(f|f+)  =  z(t\ta)  +  z(t\t(3)  (A. 13) 

where  the  recursions  for  the  inverse  covariances  and  a  corresponding  equivalent  ex¬ 
pression  for  J(t )  are: 


S(t\t)  =  S(t\t+)  +  CT(t)R~l(t)C{t)  (A. 14) 

J(t)  =  {I-  B(t)[BT(t)S(t\t)B(t)  +  I}~1BT(t)S(t\t)}A(t)  (A. 15) 

S(Py\t)  =  JT(t)S(t\t)A(t)  (A. 16) 

S(t\t+)  =  S(t\ta)  +  S(t\t0)  (A. 17) 


Note  in  particular  the  simple  form  of  the  fusion  calculations  (A. 12)  ,  (A.17), 
emphasizing  the  fact  that  independent  sets  of  information  are  being  combined.  Also, 
as  indicated  in  Section  3,  this  algorithm,  is  well-defined  when  S  is  singular,  i.e.  when 
insufficient  information  has  been  collected  for  x  to  be  estimable.  In  particular,  the 
initialization  of  the  ML  algorithm  at  the  finest  level  is  given  by 

i(£|£+)  =  0 , 5(£|£+)  =  0  for  all  t  such  that  m(t)=M  (A. 18) 

In  addition,  further  algebra  yields  the  corresponding  version  of  the  smoothing  step 
(3.11)  -  (3.12),  using  only  the  information  filter  quantities  calculated  during  the  up¬ 
ward  sweep: 


xs(t )  =  J(t)xs(t  7)  +  J(t)A  1  (t)B(t)BT  (t)z(t\t) 


(A.19) 
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Ps(t)  =  J(t)Ps(tj)  JT{t)  +  J(t)A~\t)B(t)BT(t)JT(t) 
where  this  is  initialized  at  the  top  of  the  tree  with 


*.(o)  =  p,(0)5(0|0) 

Ps(0)  =  [5(0|0)  +  P-1(0)]-1 


(A.20) 

(A.21) 
(A. 22) 


B  Appendix 

In  this  appendix  we  describe  an  alternate  derivation  of  the  RTS  smoothing  algorithm 
by  introducing  the  Hamiltonian  form  of  the  smoothing  equations  on  the  tree.  For  sim¬ 
plicity  in  exposition  and  notation  we  focus  here  exclusively  on  the  constant  parameter 
case.  The  extension  to  the  general  case  is  straightforward. 

Specifically,  consider  the  model 

x(t )  =  Ax(t  7)  +  Bw(t)  (B.l) 

y(t)  =  Cx(t)  +  v(t)  (B.2) 

where  w  and  v  are  white-noise  processes  with  variances  I  and  R  respectively,  and 
(B.l),  (B.2)  are  defined  on  an  M-level  tree,  i.e.  m{t)  =  0, . . . ,  M ,  with  a  single 
root  node  which  we  denote  by  0.  The  Hamiltonian  form  of  the  smoothing  equations 
can  be  derived  in  several  ways:  using  the  complementary  model  construction  as  de¬ 
scribed,  for  example,  in  [6]  or  by  examining  the  minimization  problem  in  computing 
the  x(t)-trajectory  that  has  maximum  posterior  probability  given  the  data,  the  prior 
statistics  and  those  of  the  noises,  and  the  dynamic  constraint  (B.l).  We  follow  the 
latter  approach  here.  Specifically,  with  x(0)  having  prior  mean  of  0  and  prior  co- 
variance  of  Px( 0),  by  straightforward  adaptation  of  standard  results  we  find  that  the 
optimal  smoothed  estimate  trajectory  xs(t)  is  obtained  by  minimizing  the  following 
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Hamiltonian 

H  =  ’LkyW-Cx(t))TR-I(y(t)-Cx(t))  +  j:NT(t)w{t)  (B.3) 

t  z  t/o  z 

+  ^r(0)P"1(0)x(0)  +  2  AT(f)(x(f)  -  Ar(*7)  -  Bw(t)) 

z  t^o 

with  respect  to  the  state  x ,  the  noise  tu,  and  the  Lagrange  multiplier  A T(t). 

As  in  the  standard  case,  after  we  set  to  zero  the  derivatives  of  H  with  respect 
to  x,  w,  and  A,  we  find  that  we  can  eliminate  w  by  expressing  it  as  a  function  of  A, 
yielding  the  following  optimal  smoothing  equations  for  m(t)  =  1, . . . ,  M: 

A (f)  =  AT[A(ta)  +  A(f/?)]  -  CTR~lCxs{t)  +  CT R'^t)  (B.4) 

xs(t)  =  Axs(t  7)  +  BBT\(t )  (B.5) 

and  the  boundary  conditions3 

*.(0)  =  [P*(0)  +  C'ri2-1C]r1{  Ar[A(0a)  +  A(0^)]  +  CT  R~ly{  0)}  (B.6) 


\{t)  =  0  ,  m{t)  =  M+  1  (B.7) 

Let  us  note  several  points  concerning  these  equations.  First  note  that,  as  in 
the  standard  case,  the  dual  dynamics  for  A  run  in  the  opposite  direction  to  the  x- 
dynamics.  In  this  case,  thanks  to  the  asymmetry  of  the  tree,  the  dual  dynamics 
(B.4)  are  in  the  form  of  fine- to- coarse  dynamics  which  merge  values  as  we  progress 
up  the  tree  (4.1).  Secondly,  by  organizing  the  dynamics  (B.5),  (B.6)  we  can  obtain 
the  Hamiltonian  form  of  the  dynamics  for  m{i)  =  1, . . . ,  M: 
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with  boundary  conditions  given  by  (B.6)  and  (B.7).  Also,  the  matrices  in  (B.8)  are 


A 


-A  0 

-A  0 

CTR~lC  I 


0 


a 


e0  = 


I  -BBT 
0  0 
0  —AT 

0  0 

I  -bbt 

0  -AT 


(B.9) 


(B.10) 


(B.ll) 


While  the  dynamics  strongly  resemble  the  standard  Hamiltonian  equations,  there  is 
a  substantial  difference  due  to  the  fact  that  the  number  of  points  double  as  we  move 
from  one  level  to  the  next  finer  level —  i.e.  (B.8)  involves  one  node  t  but  two  nodes,  ta 
and  t(3 ,  at  the  next  level.  This  asymmetry  in  the  number  of  variables  in  (B.8)  makes 
it  impossible  to  “diagonalize”  the  Hamiltonian— i.e.  to  decouple  the  dynamics  and 
boundary  conditions  into  separate  upward  and  downward  dynamics  driven  by  y{t ) — 
and  thus  there  is  no  two-filter  algorithm  as  in  [6],  [7].  However,  we  can  triangularize 
these  dynamics  and  boundary  conditions  to  obtain  an  RTS  algorithm. 

Specifically,  drawing  inspiration  from  [6], [7]  consider  a  time-varying  transforma¬ 
tion  of  the  following  form 


’  xu  ' 

E? 

II 

X 

X 

i 

X 

t 


(B.12) 


where 


(B.13) 


With  respect  to  the  transformed  variables  xu  and  x  we  now  wish  to  transform  the 
Hamiltonian  dynamics  and  boundary  conditions  into  a  form  in  which  there  is  an 
upward  recursion  for  xu  followed  by  a  downward  recursion  for  xs.  Note  that  we 
are  free  to  multiply  (B.8)  on  the  left  by  an  invertible  matrix,  Sm(t)i  without  losing 


B  APPENDIX 


43 


information.  By  doing  so,  we  wish  to  transform  the  dynamics  into  the  following 
structure. 


Sm(t)AT~l 


m(t) 


where 


X 

xu 

xu 

— 

'  CTR-1y(t )  ' 
0 

X 

t 

X 

at 

X 

pt 

_  0 

S„ 


SmAT-1  = 


Q  Q  T~l  — 


-P-M-1  -P-M-1  I 


0  I 

I  0 

I  0 

L\  Li 

L3  l4 

Fm+l  0 
0  0 
N  Gm+ 1  j 

F-m+l  0 

N  Gm+1 

0  0 


0 

0 


(B.14) 


(B.15) 


(B.16) 


(B.17) 


(B.18) 


Substituting  the  forms  of  (B.13),  (B.15)  into  (B.16)-  (B.18)  yields  the  following  con¬ 
straints  for  Li  -  L4,  N,  Fm,  Gm,  Pm ,  and  Tm: 


L\  —  L-i  —  0  ,  Li  —  L4  =  —A 
N  =  -BBt 


(B.19) 

(B.20) 


rm  =  2P-1  +  ctr~1c 


-1  A-l 


■^171+1  Tm+i  —  —P„  A 


(B.21) 

(B.22) 
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Fm+i  =  P-lA~1BBT-AT  (B.23) 

Gm+i  =  I  +  BBt  Tm+1  (B.24) 

Combining  (B.21)-  (B.23)  yields  a  recursion  for  P~l: 

P =  (A-1(2P-}1  +  CrJR-1C')-1A-T  +  A~1BBtA~t)~1  (B.25) 

which  is  exactly  the  same  as  the  information  form  of  the  ML  Riccati  equation  (5.11), 
(5.12)  in  the  constant  parameter  case-  i.e.  if  we  set 

pm  =  PML(m  \m  +  !)  (B-26) 

then  P~l  satisfies  (B.25)  together  with  the  boundary  condition 

Pm 1  =  0  (B.27) 

Furthermore  in  this  case  from  (B.21)  we  see  that 

rm  =  pML(m\m)  (B.28) 

and,  using  these  identifications  plus  (B.23),  (B.24),  yields 

Fm+1  =  -P-'A-'T-1^  =  -PMlL{m\m  +  1  )A~1PML(m  +  l|m  +  1)  (B.29) 

Gm+i  =  APmArrm+1  =  APML(m\m  +  1  )ATP^L(m  +  l|m  +  1)  (B.30) 


so  that  Fm  =  —  JT(m)  and  Gm  —  AJ~1(m),  where  J  is  defined  in  Section  2. 

Finally,  using  these  expressions  and  the  dynamics  (B.14)  -  (B.18)  yields  the  fol¬ 
lowing  algorithm.  The  filtering  recursion  is  given  by 


xu(t)  =  JT(m(t)  +  l)[xu(ta)  +  xu(t/3)\  +  CTR~1y(t)  ,  m{t)  =  0, . . . ,  M  -  1  (B.31) 
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with  initial  conditions  (using  (B.12),  (B.13),  (B.21)  for  m  =  M,  (B.27),  and  (B.7) 

xu(t)  =  CTR~1y(t)  ,  m(t)  =  M  (B.32) 

Using  the  boundary  conditions  at  t  =  0  yields  the  initial  condition 

MO)  =  [r0  +  P-^O)]-1  *“(())  (B.33) 

for  the  downward  recursion,  which  we  directly  have  from  (B.14)-  (B.18): 

xs(t)  =  J{m{t))xa(t)  +  J(m(t))A~1BBTxu(t)  (B.34) 

Finally,  comparing  (B.31)-  (B.34)  to  (A. 8)-  (A. 10),  (A. 16),  (A.18),  (A. 19),  we  see  that 
this  triangularization  yields  the  information  filter  form  of  the  ML  RTS  algorithm. 


References 

[1]  K.C.  Chou,  A.S.  Willsky,  and  A.  Benveniste,  “Multiscale  Recursive  Estimation, 
Data  Fusion,  and  Regularization”,  submitted  for  publication. 

[2]  A.  Benveniste,  R.  Nikoukhah,  and  A.S.  Willsky,  “Multiscale  System  Theory”, 
Proceedings  of  the  29th  IEEE  Conference  on  Decision  and  Control,  Honolulu, 
HI,  December  1990. 

[3]  J.  Deyst,  C.  Price,  “Conditions  for  Asymptotic  Stability  of  the  Discrete,  Mini¬ 
mum  Variance,  Linear,  Estimator,”  IEEE  Trans,  on  Automatic  Control ,  vol.  13, 
pp.  702-  705,  Dec.  1968. 

[4]  R.  Nikoukhah,  A.S.  Willsky  and  B.C.  Levy,  “Kalman  Filtering  and  Riccati  Equa¬ 
tions  for  Descriptor  Systems” ,  submitted  to  it  IEEE  Trans,  on  Automatic  Con¬ 
trol. 

[5]  H.  E.  Rauch,  F.  Tung,  and  C.  T.  Striebel,  “Maximum  Likelihood  Estimates 
of  Linear  Dynamic  Systems,”  AIAA  Journal,  Vol.  3,  No.  8,  Aug.  1965,  pp.  1445- 
1450. 


REFERENCES 


46 


[6]  M.B.  Adams,  A.S.  Willsky,  and  B.C.  Levy,  “Linear  Estimation  of  Boundary 
Value  Stochastic  Processes:  Part  I:  The  Role  and  Construction  of  Complemen¬ 
tary  Models”  and  Part  II:  1-D  Smoothing  Problems,”  IEEE  Trans,  on  Automatic 
Control ,  Vol  29,  Sept.  1984,  pp.  803-821. 

[7]  T.  Kailath  and  L.  Ljung,  “Two  Filter  Smoothing  Formulas  by  Diagonalization 
of  the  Hamiltonian  Equations,”  Int’l  J.  Control ,  Vol.  36,  1982,  pp.  663-673. 

[8]  B.D.O.  Anderson  and  J.B.  Moore,  Optimal  Filtering ,  Prentice- Hall,  Englewood 
Cliffs,  N.J.,  1979. 

[9]  K.C.  Chou,  “A  Stochastic  Modeling  Approach  to  Multiscale  Signal  Process¬ 
ing,”  M.I.T.  Department  of  Electrical  Engineering  and  Computer  Science,  Ph.D. 
Thesis,  May  1991. 


